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ABSTRACT 


Described are the modular and power series methods for solving the system of linear equa- 
tions 
A(z) x(z) = b(z), 
where the components of A(z) and b(z) are polynomials with coefficients over the integers modulo 
p. Theoretical cost estimates for obtaining the solution in rational form are shown to be 
4a2NS + 2 ant and 1032N? operations for each of these methods, respectively, where N is the 


order of the system and @ is the maximum degree of the component polynomials. These estimates, 
together with empirical timing results, are used to identify classes of problems for which each 
method is superior. Finally, the results of the modular and power series methods are used to 
describe an algorithm for solving systems of linear equations with component polynomials over the 


integers. 
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CHAPTER 1 


Introduction 


Let Z denote the integers, a the integers modulo the prime p, Z[z] the polynomials in z 


with integer coefficients, and Tey the polynomials in z with coefficients in oe The primary pur- 


pose of this thesis is the comparison of methods for symbolically solving the linear system of equa- 
tions 

A(z) x(z) = b(z), (1.1) 
where A(z) is a matrix of order N and b(z) is a vector of length N with components being polyno- 
mials in z. 

The established method of solving (1.1) is the modular method, described in careful detail by 
McClellan[{13]. This method is essentially one of determination of the value of the solution at cer- 
tain evaluation points, followed by interpolation over these points to yield, finally, the solution in 
rational form. Extensive theoretical and empirical investigations have been performed comparing 
the modular method for solving (1.1) with other viable methods, such as the Gaussian Elimination 
(exact-division) algorithm, the two-step, fraction-free method, and the multi-step methods (see 
McClellan[14], Bariess[2], Cabay and Lam[8] and Higginson[10] ). The conclusions reached are 
that, as a general problem solving algorithm (that is, for random A(z) and b(z)), the modular 
method is clearly superior. Consequently, in this thesis, from al] these methods, only the modular 
method is given consideration. 

In this thesis, the modular method is compared theoretically and empirically with the power 
series method, introduced recently by Cabay{5]. This method essentially determines the derivatives 
of the solution at a single evaluation point, from which the power series form, and eventually the 


rational function form of the solution is then constructed. 
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The thesis commences by describing in Chapter 2, an established method for solving systems 
Z 


over ea These solutions are basic in the development of both the modular and power series 
methods for solving systems over Zz} 


(p) 


Chapter 3 describes the modular method for solving systems over ae Included in the 


development are detailed cost estimates and a new termination criterion. 


Chapter 4 describes the power series method for solving systems over ae In addition to 


deriving cost estimates, some intrinsic problems for the implementation of the method are 
highlighted. 

The major purpose of this thesis is a theoretical and empirical comparison of costs of the 
modular and power series methods in order to determine classes of problems for which each is 
superior. Chapter 5 describes the implementation of these methods and the testing procedures 
used. Empirical results confirm the theoretical cost estimates of Chapters 3 and 4 and identify 
practical trade-offs between the methods. 

The final chapter uses the the methods of Chapters 3 and 4 to solve systems over Z[z]. Hav- 


ing obtained the solutions of (1.1) over an for several primes p,, i = 0,1, ...,+y, the Chinese 


Remainder Theorem permits the construction of the solution over Z{z]. Cost estimates are 


obtained, problems with bad primes are identified, and a new termination criterion is derived. 
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CHAPTER 2 


The Adjoirt Sclution over Le 


(p) 
2.1. Introduction 
Given the matrix A of order N with entries over ey let A*” denote the adjoint of A. 
Then, 
AW A=d I, (2.1) 


where d is the determinant of A and I is the identity matrix of order N. 


For the system of linear equations 


Ax=b, (2.2) 
the adjoint solution is defined to be the pair (y,d) where 


y = At4%b, (2.3) 
If d + O, then from (2.1) and (2.2), it follows that 


x = yld. (2.4) 
On the other hand, if d = 0, then x may not exist, whereas the adjoint solution always does. 


In subsequent chapters, it is the adjoint solution that is usually required. 


To obtain the adjoint solution, a variant of the Gaussian elimination method (see Forsythe 


and Moler [9] and Lipson [11]) is used. The method consists of three steps. 


In the first step, forward elimination yields the triangular decomposition 


PA = LU, (2.5) 
where P is a permutation matrix, L is a lower triangular matrix of multipliers and 
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is the upper echelon form of A. 
In the second step, forward substitution is used to find a vector b’ such that 


Lb’ = Pb. (2.7) 
A solution of (2.7) always exists since L is a lower triangular matrix with 1’s along the diagonal. 


It now follows from (2.3), (2.5) and (2.7) that 


y= Ata b (2.8) 


(P- y u)~ b 


urd (L-1(p-)4! b) = U4 (det(P)-L-*P 2B) 
=det(P) U%! b' 
and, in addition, that 
N 
d = det(P) det(U) = det(P):-[[ uy (2.9) 
joi 
where det(P) = +1. Thus, except for the sign of det(P), the adjoint solution of (2.2) is also the 
adjoint solution of 
Ux=b’. (2.10) 
It remains, therefore, in the last step to solve (2.10) for (y,d). If rank(U) = N, then x is 


obtained from (2.10) by back substitution and then 


y=dzx (AT) 
follows. If rank(U) < N—2, (i.e. the last two rows of U are zero) then replacing any column of U 


ari eet). # of 


* ,66e ui) 


Aw orn gists ape OO 
i duty Aecet® © 08 re gay GB aelten> :! Towra oa terys of @ 


3) ao | 
anh ahitine i do men nghh ted Oa Sera cries apo (Lee 


anit) aw (85' (Vl net ad aa 
na bt 
me 


faay=siermea}ins = (oY a) le 


- . 
‘4m 4 
ioe 7 


62.5) ha (ki pee 
Drs A b Ghee rAe tGRR ages wh ere awl 1 aC 


’ 


by b ’ results in a matrix with rank at most N-1. It now follows by Cramer’s rule that 


y=0. (2.12) 
If rank(U) = N-1 (i.e. the last row only of U is zero), then Cramer’s rule for the last N-k+1 com- 
ponents of y yields 
acyl F2 N 
ve = ey I[4y TT 4-1) On 2343) 
=1 @k+1 

and 

ya OM = bt e+ een (2.14) 


where k is as in (2.6). The remaining components of y are obtained by back substitution applied 


to 
Uy=db' =0. (2.15) 
That is, 
k 
y= —uy? DIN IN Ty, Pec ied tae Sa? Shee ace (2.16) 
joi+l 


2.2. Cost Analysis 
The number of arithmetic operations required to find the adjoint solution (y,d) to (2.2) is 
counted as a measure to the cost of finding the solution. In the count, only the dominating terms 


are retained. 


3 
The first step is to find the triangular decomposition (2.5). This step requires a operations 


(counting additions and multiplications)(see Atkinson [1]). 

The forward substitution of the second step to solve (2.7) requires N? operations. 

The third step involves finding the adjoint solution to (2.10). If d # 0, then the back substi- 
tution requires N* operations. The calculation of the determinant, d, and its subsequent use in 
(2.11) to find y requires 2N operations. Thus the total number of operations to find the solution 


(y,d) for (2.10) and (2.11) when d # 0 is N’. If d = 0, then to find (2.12) or (2.13) and (2.14) 
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requires at most N operations and the subsequent back substitution to find (2.16) requires at most 
k operations. Thus, since k < N, the back substitution requires at most N* operations to find the 


adjoint solution to (2.10). 


Thus the total cost of finding the adjoint solution to (2.2) requires 


ae (2.17) 
3 
operations. To find the solution for another b when the triangular decomposition (2.5) is available 
requires only 


additional operations for each b. 
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3.1. Description 
Let A(z) be a matrix of order N and b(z) be a vector of length N, where the elements of 


A(z) and b(z) are univariate polynomials of degree at most 4, whose coefficients are over at the 


integers modulo the prime p. This chapter describes the modular technique for finding the adjoint 


solution (y(z),d(z)) where 


y(z) = AZ) bz) (3.1a) 
and 

d(z) = det (4@). (3.1b) 
Then 

A(z) y(z) = d(z) b(), (3.2) 
where y(z) is a vector of length N of polynomials over a All arithmetic considered in this 


Z 
chapter is done over ——. 
= () 
The modular methods make use of the Chinese Remainder Theorom (CRT) for polynomials 


to find the solution over the polynomia!s over a To use the CRT, the system (3.2) is first 


solved modulo the prime polynomial m,(z) to obtain the adjoint solution (TeI.4@ of 


Aiz) yz) = d,(z)-b,(z) mod m(z), (3.3) 
where 
yilz) = y(z) mod m(z) (3.4) 


d,(z) = d(z) mod m,(z) 
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Aj(z) = A(z) mod m,(z) 
b,(z) = b(z) mod mz). 
These residuals, y,(z) and d,(z), for i = 0,1, .. . ,y, are then used to reconstruct the solution 


(y(z),d(z)) via the CRT, where - is such that 


deg timc) = 6, (3.5) 


5 being the maximum degree of the solution’s polynomials. 


Since the linear polynomials, (: = 2), z € a are irreducible over nae the prime poly- 
nomials m,(z) are set to m,(z) = (: = 2), and y is set to y = 6 in (3.5). Using these linear poly- 
nomials (: = zi). the next step is to solve (3.2) modulo (: = 2). But finding a polynomial’s value 
modulo a linear polynomial, (- = zi), is equivalent to evaluating that polynomial at the point z,. 
This implies that 

A(z) mod {z - z) = A(z) (3.6) 
b(z) mod (2 - 21) = b(z) 
d(z) mod (- = z) = d(z,) 


and 


y(z) mod (z - z,) = y(z) 


for each polynomial m,(z) = (z - 5 sft 0,1 er Ss 


Therefore, to find the solution (siz). (z)|, the system [A(z),b(z)] is evaluated at the point 
aS ; Z : 
z, and the resultant system A(z,) yi(z) = 4,(z) b(z)) ts solved over ey The solution to the system 


over a has results whose elements are in ai as well, and as such 


yz) = (3.7) 


Ay “ae 
._ 
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d(z) = d,. 
These y, and d, for the different prime polynomials m(z) = (: oo 2] are then the residuals 
used by the CRT to reconstruct the polynomials of the solution. 


A point to note is that these residuals y, and d, are exactly the values of their respective poly- 
nomials y(z) and d(z) evaluated at the point z, and therefore are interpolation points of the polyno- 
mials. As well, the polynomials can be constructed uniquely up to a degree § whenever their values 
are known at §+1 distinct evaluation points. So the application of the CRT for the reconstruction 
of the polynomials is basically the interpolation of the polynomials from their values at these 
points. 

The Newtonian form of the CRT is used for the reconstruction of the polynomials 
(v@), a(@) from their residuals jj and dj, i= 0,1,...,8. This algorithm, explained below in 
terms of d,, to find the polynomial d(z), can be applied for each element in the vector y(z). The 


algorithm defines the resultant polynomial d*!(z) of degree k, at the kth stage, iteratively as 


k-1 
d8(2) = dQ) + a T(z - 2), (3.8) 
i=0 
where ue k = 0,1,2, ...,8 and 
dy = A(z) = dy (3.9) 


A (4 - d*-"(z) } x S, mod (: ~- Zi). 


The S, are defined as 
k-1 =H t—1 ae 
St = Il («.-z) = Il (2-2) mod p. (3.10) 
i=0 i=0 


In the implementation of this algorithm with the maximum degree of the solution known, the S,’s 


can be calculated prior to the initiation of the algorithm and stored for use provided the z, are 


known. 
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3.2. Implementation of the Modular Technique 


The first step in the implementation is to find a bound on the degree of the adjoint solution 
(3.11). The bound, 8, put forth by McClellan[13], is defined as 


N 
$= e a 33 = MIN (8,81, bel By) (3.11) 


i=l 


where 6,, i = 1,2, ...,N, is the maximum degree of an entry in column i of A(z) and &, is the 


maximum degree of the entries in the right hand side, b(z). 


The next step is to set the evaluation points z, i = 0,1, . . . ,8. For this implementation, the 
z; were Set to be z; = i, i = 0,1, . . . ,8. With the z, so set, the 5, values become 
S, = Té-0)" - 1 =(k!)"' mod p, (3512) 
ie = 
FOr eli 2 occ. oy On 


The third step is the evaluation of the matrix A(z) and the vector b(z) at z, to get the 


integer system [ae | such that 


Ay = A(k) 
by = b(k) 
for k = 0,1, ...,8. The evaluation of matrix A(z) and the vector b(z) at the point z, is straight- 


forward and is done using Horner’s method. 


The fourth step is solving A, x, = 5; to obtain the adjoint solution 


At Ye = & by mod p, (3.13) 
forks 0 1c 0. 
The adjoint solution of (3.13) over a is found using the LU decomposition procedures of 


Chapter 2. First, the decomposition into triangular matrices is applied to A,, then the solution is 
extracated from this triangular decomposition and the right hand side b,. The decomposition into 


triangular matrices returns the determinant, d,, of AS If As is singular then back substitution 
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returns the adjoint solution vector, y,, so that (x. 0} . On the other hand, when A, is non- 
singular, the triangular decomposition process will return d, # 0. Thus, to get the adjoint solution 
vector, ve , the result , x,, from the application of forward and back substitution, must be multiplied 
by d. 

The fifth step is the application of the CRT (3.9) to the residuals (j7,d) for 


i= 0,1, ...,8. This yields 
8 k-1 ‘ 
Vey =o: AL e—1) (3.14) 
k=0 i=0 
8 k-1 
d(z) = Sd’, [[ @-i), 
k=-0 i=0 


where y’, is a vector of length N with elements in os and d’,e 


(p) 


ZL 


yi Equation (3.14) is called the 


mixed radix representation of the adjoint solution. 


In the sixth and final step, this mixed radix representation (3.14) is transformed into the 


more usual fixed radix form 
8 
y(z) = Sz" (3.15) 
k=0 


8 
d(z) = Dy) tal pa 
k=0 


where y, is vector of length N with elements in 2 and d,e-Z. 


(p) (p) 


3.3. Cost Analysis 

To analyse the cost of obtaining the adjoint solution @.4@)) to satisfy (3.2), the measure 
used is the number of arithmetic operations required to find the solution with only the dominating 
terms retained. 

The cost of precalculating the S, for the CRT using (3.12) requires 5 operations where 6 is 


the bound on the degree of the solution’s polynomials. 
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The cost of evaluating the matrix A(z) and the vector b(z) at the §+1 evaluation points, 


where Horner’s rule is used, is 288” operations. 


Calculating the §+1 adjcint solutions to the systems over using the procedure of 


BZ 
(p)’ 
Chapter 2, requires San Operations. 

To calculate the coefficient, a,, of the term at the k* stage of the CRT application, using 
(3.9) and the precalculated S,, involves the evaluation of the polynomial d*~"!(z) of degree k-1 at 
the point z,, plus 2 operations to find a. Thus, to find the coefficient a, requires 2k operations. 
Therefore, the total number of operations for the application of the CRT to the N+1 elements of 
the solution to get polynomials of degree § requires 8’N operations. 

The last count to consider is that required to transform the solution from the mixed radix 


form (3.14) to the fixed radix from (3.15). This transformation requires 8’N operations. 


Therefore, the total operation count for finding the adjoint solution in the form (3.15), given 


the bound § < @N, has the dominating terms 
4g?N3 + Zam (3.16) 


3.4. Terminating the Iteration 

When calculating the terms of the polynomial solution up to the bound, 5, unnecessary calcu- 
lations are performed when the bounds are too pessimistic. The following theorem can be used to 
determine if sufficient terms of the polynomial solution have been found to guarantee that the 


adjoint solution 6.4) has been found such that, when d(z) # 0, 


x(2) = aa 3.17) 


The theorem is an extension to the polynomial case of the theorem presented by Cabay [4] 


for solving linear systems over the integers. 
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One point to notice about the theorem is that the coefficients of the polynomials need not be 


restricted to a field and may be over an arbitrary intergal domain. 


Tueorem 3.1 


Suppose A(z) = [a.()| is a matrix of order N and b(z) = Eo) is a vector of length N, 


where 
a,(z) = af) + af z+aP2+ --- +402 (3.18) 
b(z) = BO + BM 2+ bP r+ --- +0 2. 


are such that deg ( a,,(z) ] < and deg b,(2) } < dfor allij,O<ij <N. 
If the mixed radix representation of the adjoint solution (@),4@)) is 
yz) = (2) + O-(z - m) --- (e - 2) (3.19) 
#02 +0(2— 29) +++ (2 = tures} + duerarile): (6 - 2) -- > (2 - tus) 
d(z) = d(z) + 0-(z - m) --- (e- 2) 


stepeeneiss 1-1‘): (: = 20) eur hs (z oe Papel a dues ail): (z ri 20] Psi (: a Brel 


where 
yi(z) = yo + 1 (- ZZ 20) (3.20) 
a eah\ (ear ae oe(ereee 
dj(z) = dy + dy: (: = 20) 


WMbSa\boa) so enlb= 3) bean 


then 
A(z) y(z) = 4,(z) b(z). (3.21) 


Proof 
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Suppose the contrary, i.e. A(z) y,(z) — dz) b(z)#0. Then for some element of 


A(z) y(z) — d(z) b(z), there exists a nonzero polynomial p(z) such that 


0= |A@ y@ - 4@s@)], = [A@a.@ - 4@ 00], + PL. 
But 


[A@ » @ - 4@)b@), = [Ae n@ - a) 0], 
- [4@) Yu+a+i(Z) — du +a+i(z) b(2)],-((2 a Z0) ee (z 7 Za\|) 


Therefore, 

Bizje= (: - 20) ae (: a Zu) [4@) Yu +a+1(Z) — dy +a+i(Z) b(2)| 

= (: _ z9] aes (: — Zur» a) [3.246 [sural | = aah) 612]. 

Since p(z) # 0, 

3a,(2) | sorted], — ducsasse) (@) # 0 

j-1 
and its degree is = 0. This implies 

deg (p(z)) = dea of) [Eau [saroeale)] = dysarite) 70)| 


>M +a. 
Thus, 


deg (A(z) yi(z) - (2) b@)) Spee) 


But 


[K@) n@ - 4) de], = Bae) [1 ], - 4@ a0) 


where deg (| AC |) <M and deg d,(z) ) = M. Thus, 


deg [Eau [»@] - aes 2) 


fon 4 shes oeh. iej.ei9)* ite +4) » ~ [abs _ «8 


¢ 9 : 
athe = /: niiot- Co. ua) : 


Hewes Aid =a ITO — tne 


15 


Therefore, 


deg|A(z) (2) ~ 4(2) 6(@)] <M + a, 


and thus the contradiction. 
QED 


Thus, bounds for the degree of the adjoint solution are now unnecessary. All that is 
required to terminate the iteration is that a sufficient number of consecutive zero coefficients for 
all of the polynomials of the solution be encountered. If the bounds on the degree of the adjoint 
solution are tight then as many as d extra iterations are done to find a sufficient number of zeros 
to satisfy the theorem than would have been required using the bound. But when the bounds are 
not tight, the theorem can result in substantial savings by terminating the iteration long before the 


bounds would be reached. 


If at stage M+d + 1, where @ and M are defined as in Theorem 3.1, the theorem is applied 
to terminate the iteration of the algorithm, the result is the adjoint solution in the mixed radix 


form (3.19). Thus if d,(z)#0 then 


y,(z) _ y(z) _ ¥(Z) + Yu +a+i(Z) (: -y z0) as (- a Zy+5) 


le d(z)  d(z) — d,(z) + dy+a+i(z) (: oa 20) ws (: ~ ras) (3.22) 
where b@,4@)} is the complete solution to the system. Therefore, 
A(z) (y2) + years) (2 - 20) «+> (& - 2uea)}) (3.23) 


= (4(2) + dy+a+i(z) (- = 20) oe (- = ay! b(z). 
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This implies 
0= [A@ n@ - a2) 0()| (3.24) 
+ (ea) +> (& - cura) [A@ wera) ~ desert) 0@)]. 
Thus, 


O = A(z) yu+a+1(Z) — dy +a+1(z) d(z). 
As a result, (Yue a+12).dueeaer2)) is a solution to the system as well. Assuming dy. ;.;(z)#0, 


(for if it were then yy+3.1(z)=0, as well) and since d,(z)#0, this results in 


= ys (z) ae Yu +a+1(Z) 
De Test fhe Oy: Oe) 


Assuming that the GCD (.@).4,(2)) = 1 (if it is not, then remove the common factor), then 


ee) eee) look like 


Yu+a+i(Z) = M(z) y,(z) (3.26) 
dy +a+1(z) = M{z) d,(z) 
for some M(z). The result is that the complete solution, (2),d@)), is of the form 


yz) = w(z) (1+ M@ (ze - 20) --- (& - u+2)] (3.27 
d(z) = d,(z) (1 + M(z) (- = 20] Set (- = zu+s))- 

As will be seen later in Chapter 6, using the solution ( (z),d, (2)] can lead to problems when 
solving over the integers. Therefore, if it is possible to find the polynomial M(z) without further 
iteration of the algorithm, the entire solution { y(z),d(z) } could be constructed at this point. But 
(3.27) can be viewed as a system of N + 1 equations in the N + 2 unknowns y(z) (a vector of 
length N), d(z) and M(z). Thus M(z) cannot be solved for immediately. In addition, the following 


example illustrates that not enough information is available to determine M(z) directly. 
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Example 
Let 
z (z-1) -- - -7) z 
A(z) = (3.28) 
= (1 + (z-15)}(¢-8)(2—9) - - - (@-14) 
3 z (z—1) - - + (¢—9) + z (z—16)(z—17) 
b(z) = : 
(1 A (z—-15)} (@-8)(z-9) - +» (z—14)(2—16)(z—-17) — 3 (z—-8)(z—9) 


The solution x(z) is then 


3(Z—8)(Z—9) 
x(z) = ; (3.29) 
@-—16)(2—17) 
and the adjoint solution ((2),4@)) is given by 
S26) (2-9) 
yz) = (2 + (1+ @-15)) 2 @-1) --- @-14)) (3.30) 
(z—16)(z—17) 


d(@)= 2+ (1+ (2—15)) z (¢-1) - - - @-14). 
When solving the system using the modular techniques of this chapter, the complete adjoint 


solution (y(z),d(2)) , with coefficients in ay p = 45233, is 


y(z) = yz) + z ZN) + + G13) Yu rari) (3.31) 
d(z) = d,(z) + z (z—1) = + + (@—13)-dy+a+i@), 
where the application of Theorem 3.1 at stage 14 results in the solution ( y;(z), d,(z) i, in mixed 


radix form, of 


168 z — 422 (z-1) + 32 (z—1)(z—-2) 


y(z) = : 
240 z — 302 (z-1) + z (z-1)(z—-2) 


(3.32) 


dj(z) =z 
and with (suri), duasaxi(2)) as 
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Yut+a+i(z) = (2-14) (3.33) 


126 + 210 (z—15) + 51 (z—15)(z—16) + 3 (z—15)(z—16)(z-17) 
Xx 


2 — 2 (2-15) + (z-15)(z-16) + (z—-15)(z—16)(z-17) 


Guta) —(Z-14)7. 
Since the cco ( y,(z), d,(z)) =z, the reduced form of the solution (9,2), 4,2), when 


expanded, is 
37 —-51z+ 216 3 (z—-8)(z—9) 
yr (z) = 2) = = (3.34) 
2 — 332+ 272 (z—16)(z—17) 
a2) = 42 21, 
r z 
Using (3.33) and (3.34) to find the M(z) so that (3.26) holds results in 
M(z) = (z-14)?. (5:35) 


Therefore, there is no indication at stage M + @ + 1 that the factor M(z) is (z—14)?. One of 
the factors (z - 14) is due to the fact the non-zero portion of the remaining terms of the mixed 
radix solution does not start until 2 iterations after the current stage (i.e., the next iteration pro- 
duces zero coefficients, as well). Thus the multiple A/(z) is not immediately available and, if the 
entire solution { y(z),d(z) } is required, it will have to be calculated by the continued iteration of 


the algorithm beyond the stage at which the termination criteria can be applied. 
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CHAPTER 4 


Power Series Methods over "er 


4.1. Generation of the Power Series Solution 


Let A(z) be a matrix of order N and b(z) be a vector of length N, where the elements of 


A(z) and b(z) are polynomials of degree at most @ with coefficients in Z| This section describes 


(p) 
the generation of the power series solution x(z), of 
A(z) x(z) = b(z). (4.1) 
For the generation of the solution, A(zp), 2G is required to be non-singular, for some zp. 
Given the system A(z) and b(z) in the form 
C) 
k=0 
t) 
Bz =O Shi 2, 
k=0 
where A’, is a matrix of order N and b’, a vector of length N, with entries in = the first step is 


to find the element z) such that A(z)) has rank N. As pointed out by Cabay [5], if A(zo) is singular 
for §+1 distinct points z), where 8 is a bound on the degree of the determinant of A(z), then A(z) 


mod p is singular. To facilitate finding the matrix A(zo), the system (4.2) is put into the form 


A(z) = SA (:- z0) 5 (4.3) 


8 k 
b(z) = Db (z - x] 
k=0 
where the elements of A, and by are in or Now, with A(z)) = Ao, the LU decomposition of 


Chapter 2 is applied to determine the rank of Ag. If the rank is not N, the process of transforming 


19 


etce? --t ut wei 


ee ee ee ee 


CU Pood yeas, Ore pas fF) tics Clee? Oe OFFI =7av: ka 


i? Ae 


e! re °em> > Hi oil > 7 . 
 « ao 1 AMS i% 


ie if , s Bi wif) @ , a =, » ol i Fi 
(xi ee eA S> Orgreetsp aD We Srey th ah. By: fo: 


ed a) ani pane © U pte £9 / wih A. 


ial é “~ ie 
A aT 
jt eet eee 
oe 


a 


eee 


the system (4.2) to the form (4.3) for another z) and determining the rank of the resultant constant 
matrix Ay is repeated. If the rank is N then a non-singular A(z)) has been found and the 
transformed system (4.3), for this particular zo, is used to find the power series solution to (4.1). 


This is done using the following theorem discussed by Cabay[5] to find the coefficient x, of 


« k 
x(z) = Dy [Zz — 2] . 
Pe  - 4) (4.4) 
Theorem 4.1 
min(k,a) bh Osksa 
Aw =—- > Ay, + (4.5) 
i-1 0 a<k 
ge 


This method is iterative and uses, at most, the last d terms A;, b;, and x, to compute the next 
coefficient x,. Thus, the calculations of these coefficients can continue as long as required. A 
bound on the number of terms necessary to determine the solution uniquely is arrived at by noting 
that the conversion of a power series to rational function form, with both numerator and denomi- 


nator of degree at most 5, requires at most 25 + 1 terms of the power series. 


The final step in producing the power series solution consists of converting the solution x(z) 
in (4.4) to an expansion about the point zero. This involves transforming the power series (4.4), 


truncated to 28 + 1 terms, into the form 
28 
AA at eee (4.6) 
k-0 


Obviously, this transformation is required only when the 2p of the first step is non-zero. 
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4.2. Cost Analysis of Power Series Solution Generation 


To find a measure of the cost of the algorithm to generate the power series solution, the 
number of arithmetic operations needed to find the solution are counted with only the dominating 


terms being retained. 


The cost arises from the three parts of the algorithm. The first step is the transformation of 
the system from (4.2) into (4.3) and the subsequent LU decomposition of Ay to determine its rank. 
If A(O) is non-singular then the system is already in the required form for the remainder of the 
algorithm, and the transformation is therefore required only when z # 0. The transformation 


requires d’N? operations for each trial z). The LU decomposition, using the methods of Chapter 2, 


requires =n operations for each such zy. 


Once the matrix Ay is found with rank N, the LU decomposition from the determination of 
the rank of Ag is available to be used to solve for the coefficients of the power series solution using 
(4.5). To solve for these coefficients, x,, requires the calculation of the right hand side of (4.5) 
before the forward and back substitution is applied. The calculation of this mght hand side 
requires 2@N’ operations for each successive k. Forward and back substitution to find each x 
requires an additional 2N* operations. Therefore, the operation count to find each coefficient x, is 


2aN? operations; thereby, to find 25 + 1 coefficients of (4.4) requires 48eN’ operations. 
Thus the number of operations to find the solution (4.4) truncated to 25 + 1 terms is 
48aN? + ae (4.7) 
whenever A(0) is non-singular. If A(0) is singular, then the additional cost for each trial zo in step 
1 to obtain the form (4.3) is 
een? + N° (4.8) 
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To convert the solution from the form (4.4) to the form (4.6) when zo # 0 requires 48° 
operations. 
Thus, using the bounds § = 4N, though better bounds are given by McClellan[13], the opera- 


tion count for finding the solution (4.6) to (4.1), where Ap is non-singular, is 


4a2N3- (4.9) 
When A(0) is singular the cost is 


43°N3 + k [20 i ave) (4.10) 


where k is such that A(i) is singular for i = 0,1, .. . ,k—-1. 


4.3. Pade Conversion of the Power Series 


Once the power series solution, x(z), to (4.1) is obtained, the next step is to convert these 
power series into rational form. The method used for this conversion is the diagonal algorithm 
described by Cabay and Kao[6]. 

The method takes a power series a(z) and makes successive approximates p,,(z) and q,,(z), 


each of degree i, such that 


a, @) az) ~ p,@) = of:"**’), (4.11) 
where qi, (2) and Pi, are in reduced from. The method is iterative on i, for k = 0,1,...,m and as 
such the maximum degree i,, of Pi,(Z) and q, (z) need not be known apriori. The iterative tech- 
nique fits in with the generation of the power series solution of the previous section as it generates 
the terms in increasing degrees. Consequently, whenever the conversion technique requires another 
term in the power series, it can be generated upon request. The entire process can be halted when 
the conversion process has reached a sufficient degree, i,,, for the rational function approximates. 

The algorithm obtains the next approximates p,,_(z) and q,,,,(z) from the previous two 


approximates, p;,(z) and q,(z) and p,_,(z) and q,_,(z), by working with their remainder power 
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series 

(2) a(2) ~ p,(z) = of2°***) = 0, (2** *) (4.12) 
and 

4,2) a2) — P,_(2) = off ©) = of (c*"-9), (4.13) 
where o; bee a indicates that the power series begins exactly with the power z**!” *. The 


approximates of degree i, ., are determined by first defining the remainder power series (4.12) as 


follows 


s = 
qi,(Z) a(z) — pi,(z) = (r© Satie cata te ri, p zeae + ofz’ ‘k+1 ‘ (4.14) 


where St = eed ie and re # 0. 


This is basically findirg the first non-zero terms, r{”, in the remainder power series and 


defining the value i, .; using r{° such that 


G2) 22) = P)(z) = Tie Ze ota aaae: (4.15) 
Thus, the polynomial r;,(z), of degree S,, is defined such that 


(S,) 8 
ri,(z) = ee + ri) 8 Ge 0 20 ti v4 : (4.16) 
. ° 5 
by calculating the remainder power series’ terms r, . . . in e 


The next step in the algorithm is to make the remainder power series (4.13) such that the 


polynomial r;,_,(z) is of the same degree as r,(z). Therefore, (4.13) becomes 
(S;) Sp) i + 
Gia) az) — jp ACA (2, a Tee ae Oca» rp 2‘) Ae! Ale (4.17) 
x Alp + aia! 


= raz) zt + ae | oe pices + dpe | 
The third step finds a polynomial b(z) of degree S,, such that b(0) = 1, and a constant c 


such that 
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c ry_s(2) = bz) n,(2) + of2***). (4.18) 
Using the method pointed out by Cabay and Kao{6], this step becomes the solution of the lower 


triangular system 


(0) 

Tip y pe) 

ae gy ” r,(> 
(4.19) 
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=i 
for the y). Then by normalizing the vector | such that b(z) = 5 ((v) y) z' and letting 


i=0 
¢= 6) 7 7 they are as required. These c and b(z) are used to compute the new approximates by 
noting that by multiplying (4.14) by b(z) and subtracting c-z#*! #-! times (4.17) from it, 25, more 
terms in the resultant remainder series are zeroed out. Therefore, this is a better set of approxi- 


mates to the power series a(z). 


This arithmetic then produces the following form. 
b(2) (a,(2) a(z) - p,@)) - ¢ 2 * (a,_,@) a) - r,_,@)) (4.20) 


(o@) a,@ - zt tg, @) ae) - (6G) yA) - eZ tp, _G)) 


= i, 2) (AC Pi, (2)> 
by setting 


Gee) PU) aq, @) cz ts ig, 2) (4.21) 
and 


i = i. 
Pi.) = (2) P,(@) — ez" ** py_,@). 


Evaluation of (4.20) using (4.14), (4.17) and (4.18) yields 
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b(z) eo zee ee ET o(? eet .) (4.22) 
We zit+1 = Yes ae oe w Ys ze o(zt ap Uys) G6 ] 
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Therefore, these new estimates p,, , (z) and q,,, ,(z) satisfy the condition set out by (4.11) and 
are of degree i, 4, . 

This process is iterated for k = 0,1,...,m+1, where i,,, , >, where completion of the 


next approximation step would create approximates of higher degree than required. 


4.4. Cost Analysis of the Pade Conversicn 

The cost of the algorithm to find the rational function solution to (4.1) using the power series 
generation of section 4.1 is calculated by noting that the costs of both algorithms, the power series 
generation and the Pade conversion, are independent. The algorithm for the Pade conversion to 
rational form requires 

68°N < 60°N? (4.23) 
operations for the N elements of the power series solution (see Cabay and Kao [6]). Therefore, 
using the estimates (4.9) and (4.10), the total cost of the power series method for computing the 


solution in rational form is 


10 a’N? +k 20 + ov (4.24) 


operations where k is such that A(i) is singular for i = 0,1, ...,k—1. 
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4.5. Problems with the Power Series Methods 


In the calculation of the solution to the system [4 (2),b(2)| in rational form, the solution ele- 


yz), 
d,(z) iu 


in reduced form. As mentioned before, the y,(z), j = 1,2,..,V, when calculated to the degree 8, 


ments obtained by the Pade conversion algorithm are rational functions, 


=i Aen 


will divide the adjoint solution elements. Similarly, the dj(z) will divide the determinant d(z) of 


A(z). However, in practice it does become desirable to express the solution in the form 


yu (Z) 
a) (4.25) 


where dj,(z) is the least common multiple (LCM) of the d,(z), j = 1,2,...,N. For example, if the 


partial solutions in the form (4.25) were available at every step k, k = 0,1, . . . ,m, then the ter- 
mination test proposed by Cabay[5] could be applied. 


One method of finding the LCM of two polynomials, d,(z) and dj(z), / # j, of degree i,, is 


d 
first to determine the power series, a(z), of their quotient, a then to apply the Pade algorithm 
j 


to it. The resulting rational form, SOL is in reduced form. Thus 


d,(z) = t(z) g(z) (4.26a) 


and 


dj(z) = s(z) g(z), (4.26b) 
where g(z) = GCD (4.(2).4,@)). Then, the least common multiple, d(z) = LM (4,(2),d,(2)] is 
given by 

d(z) = ¢(z) s(z) g(z) = t(z) d(z) = s(z) 4,2). (4.27) 

Finding the truncated power series of degree 2 i,, where i, is the maximum of the degrees of 
d,(z) and d,(z) at the k stage, k = 0,1,...,m, can be done using the method of solving the 
lower triangular system (4.19) with the two polynomials d,(z) and d,(z) augmented by trailing zero 


terms to get the appropriate degree 2 i,. The number of operations to complete this is 4 i7. Then 
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the application of the Pade conversion to the resultant power series requires at most 6 i? opera- 
tions. The multiplication of the polynomials to get common denominators and the appropriate 


numerators is bounded by 6 i? operations. Therefore, the cost of placing the two rational functions 


yz) yz) 
Gime (2) 


Over a common denominator is 16 i? operations. 


To calculate the LCM of the N denominators of the vector of rational functions, a divide and 
conquer technique can be used to reduce the growth of the power series. The degree of the trun- 
cated power series representing a quotient must be 2 i, where i, is the largest degree of the two 


polynomials making up the quotient at the k* stage. Using this technique, at the first stage 


2 i, + 1 terms are needed, 4 i, + 1 terms at the second, ..., and (alee ‘N i + 1 terms at the 


final stage as the denominators at each stage, j, have a degree bounded by 2/7' k, 
ie ., flogs N]. 


The cost of calculating the common denominator is then 
2 "| 
= |S | (2 (2-4 is) +6 (2-%4)). (4.28) 
eae m= flog, NI, (4.28) becomes 


Pudi td 


34 +6 521 (4.29) 


j=l j=l 
= 2 . mm 2 2 m 
=5i2 (2) 1) + 62 2" (2 } 
Using the fact that 2V > 2”, the number of operations required is 


232 4,30 
> 2 (2) (4.30) 
To calculate the common denominator at every step for the rational function approximations 


of degree i, for k = 0,1,2,...,m, where i, = k, in the worst case, results in an algorithm 


requiring 
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16 
Too (4.31) 


operations. Using the bound § S dN, (4.31) becomes 


16 23 
9° N°. (4.32) 
The calculation of the common denominators is, therefore, more expensive than the calculation of 


the solution elements in rational form. 


When A(z) modulo p is singular, there exists another factor in the cost of the algorithm for 
finding the power series solution. In this case, the attempt to find a zg such that A(zo) is non- 


ZL 
(p) 


guarantee A(z) is singular. As a result, the power series solution x(z) of the system cannot be 


singular fails for 8 + 1 distinct points 2 € . This, as pointed out before, is sufficient criteria to 


found as the existence of x(z) relies on the fact A(z) mod p is non-singular. Chapter 6 investigates 
the problems arising out of this inability to find a solution when A(z) mod p is singular when find- 
ing a solution of a system of polynomials over the integers. First, the cost of finding that A(z) 
modulo p is singular is investigated. 

The operation count of finding A(z) singular requires that for 8 + 1 points, zo, the system be 
transformed to the form (4.3) and the LU decompositicn procedure applied to determine that the 
rank of Ay is <N. This requires d7N? + oy Operations for each point (by 4.8). Using the 
bounds § < dN, finding out that A(z) is singular requires 

aN? + Sant (4.33) 
operations. The terms of this operation count for finding A(z) mod p singular, add a @ and N order 


of complexity to the algorithm, respectively. 


Therefore, the cost of finding that the matrix A(z) is singular over ai is greater than the 


cost of finding the solution if A(z) is not singular as 9 and N increase. The only saving grace to 


this, is that, with the random matrices, it is rare that more than one value Zo is tried before finding 
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a non-singular A(z) and even rarer that A(z) is singular. Thus, the costs involved to find an 


appropriate z) are rarely encountered and of finding A(z) singular even more so. 


CHAPTER § 


Comparison of Methods over Ty 


5.1. Introduction 


In this chapter, the power series methods of solving systems of polynomial equations over 
an are compared with the modular method on the basis of empirical timing. The implementation 


of the algorithms is done in the language ALGOLW in the multi-programming environment pro- 
vided by the operating system MTS running on an AMDAHL 580/5860. The times presented in 


this chapter are in milliseconds of CPU time on the 5860. 


The programming of the algorithms is done with the philosophy that any steps that are simi- 
lar between the algorithms use the same routines to perform these steps. For example, the LU 
decomposition of Chapter 2 is implemented in a routine called by all the methods’ procedures. 
Also since the Pade algorithm for the conversion of the power series to rational function form is 
inserted into the procedure to generate the power series solution, the code is exactly the same for 
the portion of the algorithm to generate the terms of the power series in both. In addition, the 
polynomials output from the procedures are in the same form as the polynomials of the system that 
is inputted (that is, that they are polynomials in z). 

Linked lists are used to represent the polynomials of the systems to give flexibility to the pro- 
grams when the systems get large. To avoid excessive overhead due to the system’s allocation of 
the records of the linked lists, as these records are dynamically allocated by the system, record col- 
lection routines were written and used to collect discarded records for reuse when possible. This 


garbage collection was used in all the programs. 
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The primes, p, for the implementation are chosen so that p? < L where L is the bound of the 
magnitude of single precision integer arithmetic for the computer. On the AMDAHL 580/5860, a 
32-bit machine, L = 2° — 1, thus p < +V2°"— 1. As will be seen in Chapter 6, when solving 
Systems over the integers, several p, are required. All these primes satisfy the above condition. 

By so limiting the value of p, all arithmetic is performed using single precision operations. 
That is, finding a value modulo p is done whenever an arithmetic operation threatens to have a 
result whose magnitude is larger than p and before the magnitude exceeds the bound L. In particu- 
lar, this then allows for the operation (a 2 at o} to be computed before finding the value modulo 
p is required, where a, b, and c have magnitudes less than p. The cost of finding the values 


modulo p is ignored when obtaining cost estimates for all methods. Inclusion would increase the 


cost estimates of both methods by the same factor of 5. 
For this implementation, the inverse, b, of an element a # 0 in ie is obtained by applying 


Euclid’s extended algorithm to a and p. Since a and p are relatively prime, this yields the values b 


and c where b < p such that 


ab + cp = 1. ($.1) 
Clearly, then 
b = a"! mod p. (5:2) 


5.2. Description of Testing Procedure 


The programs were set up as procedure calls that solve the polynomials systems modulo a 
prime in a manner transparent to the calling routing except for the format of the results. These 
procedures are set up in a driver that allows for the input and output of data with very little over- 
head. The input data is assumed to have coefficients that are smaller in magnitude than the primes, 
p;, where the system is solved modulo each of these primes. This driver then iterates several times 


solving the system modulo a different prime for each iteration. The amount of CPU time elapsed 
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during the procedure call was recorded and output along with the average for the CPU times 
elapsed for all the calls. 


The number of iterations performed varied with the size of the problem solved due to cost 
constraints but a sufficient number is used in every case until a stabilization of the times required 


per iteration has occurred. This allows for startup effects to become visible when they exist. 


With the power series techniques, a startup effect became noticeable in the first iteration for 
large systems. Most of the effect is likely due to the larger memory requirements of the methods 
as compared to the modular method. Therefore, a larger number of calls to the system to get the 
dynamically allocated records of the linked lists is executed at the beginning of each test for these 
methods as compared to the modular method. On subsequent iterations of the procedure, the gar- 
bage collection provided for the reuse of the records and thus the times for these iterations stabilize 
at a lower level than the first. In the calculations of the averages for the cases when a startup 


effect was noticed, the time of the first iteration is not incorporated. 


For the input data, once the size of problem (i.e., N and @ ) to be done is determined, the 
coefficients of the polynomials are randomly generated such that their magnitude is smaller than 
the smallest prime used, 45233. Each of the methods is then run independently on the set of data 
generated for each variation of the values of d and N investigated and the times collected and aver- 
aged as explained above. The data for these timing runs is presented in Table 5.1. 

It should be noted that with the power series methods, a value of z) # 0 does increase the 
cost of the algorithms. But, as was previously stated, a value of z) # 0 is rare. In fact, in the tim- 
ing runs of Table 5.1, there are no cases where A(0) mod p is singular and thus, z) = 0 for all the 
runs. 

In Table 5.1, the column labelled MODULAR refers to the time, in milliseconds, required to 
solve a particular problem by the modular method, the column labelled POWER SERIES refers to 


the time required to produce the power series solution of the same problem, and the column 
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Table 5.1 Times of Test Runs 
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Table 5.2 Times of Test Runs (cont’d) 


labelled RATIONAL FUNCTION refers to the time required by the power series method plus the 
additional time required to convert the power series solution to rational form. As a summary, the 


theoretical results of the costs of each of these methods is 


Modular 4a°N? + = ans 

S35 
Power Series 492N3 ere) 
Rational Function 10a7N° 


Recognizing that overhead significantly effects these cost estimates for small ¢ and N, the follow- 
ing observations relating to the close correspondence between theory and implementation can be 
made. 

(1) For small 4 fixed and large N, the cost of the modular method increases by approximately a 


factor of 16 when N doubles. 
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(2) For 4 fixed and large N, the cost of producing the power series solution and its rational form 


increases by approximately a factor of 8 when N doubles. 


(3) For N fixed, the cost of producing the power series solution and its rational form increases 


by approximately a factor of 4 when @ doubles. 
(4) The cost of producing the solution in rational form using the power series methods is 


approximately 3 the cost of producing the power series solution. 


The theoretical cost estimates given above suggest that for small @ and small N, the modular 
method is superior to the power series method for obtaining solutions in rational from. On the 
other hand, with @ fixed, this superiority is quickly lost as N grows in size. The crossover points 
for 9 = 1 and d = 2 are illustrated in Figure 5.1 and Figure 5.2, respectfully. Indeed, the primary 
objective for the implementations is to determine the class of problems for which each of the 
methods is superior. Using the theoretical estimates with @ fixed, the modular method should be 


superior to the power series method for obtaining solutions in rational form whenever 


4a2N3 + 4 aN* < 1032N%, (5.4) 
or equivalently, when 
N < 90. (5.5) 
Perhaps due partly to overhead and to implementation pecularities, but primarily due to the omus- 
sion of lower order terms in the theoretical cost estimates, the crossovers occur somewhat later 
than expected. In Figure 5.3 the dotted curve illustrates that the experimental crossovers actually 
occur when 
N=7.5 0 + 8.0. (5.6) 
The solid curve in figure 5.3 representing the crossover points between the power series gen- 
eration and the modular method of solution is exhibiting a quadratic trend in N. This is as 


expected since comparing the operation counts of the two methods gives 
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Figure 5.1 Timing of Methods(d = 1) 
2 aN + 4 0°N? = 40°N? + o(an’). 
Thus, 
z - 
3 N = ofa). 


Therefore, 
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(5.7) 
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d=c N+ o(N), 


for some constant c. This cancelation of the terms of order 9’N? in the difference of their opera- 
tion counts results in the quadratic crossover curve for the two methods. Thus, as N increases, the 


power series solution quickly becomes the least costly of the two methods for a fixed degree @. 
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Figure 5.2 Timing of Methods (9 = 2) 
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5.3. Conclusions 


For solving systems over wil if solutions in rational form are required, the results of the 


(p) 
previous section show that for small @ the power series method, together with the Pade conversion 


algorithm, is superior to the modular method. The converse is true for large @, that is, d = = N: 


This anomaly between the choice of methods does not exist if solutions in truncated power 
series form only are required. The class of problems for which the power series method is superior 


to the modular method grows quadratically with N. 


It can be observed that, in producing the solution in rational form by the power series 


method, 60 per cent of the cost is attributed to the Pade conversion algorithm. The conversion of 


d = cN? + o(N) 
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Figure 5.3 Crossover Curves for Modular vs Power Series Metheds 
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one truncated power series requires 68° operations, where 8 is the maximum degree of the polyno- 
mials in the resultant rational function. Better algorithms, which use fast polynomial arithmetic 
methods and which are of complexity o(8 log’6), are currently under theoretical and practical 
investigation ( see Brent, Gustavson and Yun[3], Verheijen({15] and Choi[8] ). Using these 
methods, the cost of producing the solution in rational form by the power series method is 

40°N? + c:dN* log?(aN), (5.8) 
for some constant c. Thus, using fast methods, the Pade conversion algorithm does not add to the 
overall cost. Practically, however, the constant c is large, approximately 93 according to 
Verheijen[15], and the advantage of fast methods is again not realized until N is large. The deter- 
mination of the crossover points where, using fast methods, the power series method becomes 
superior to the modular method for finding the solution in rational form is left as a subject for 


future research. 
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CHAPTER 6 


Solving Systems over Z[z] 


6.1. Description of the Method 
Given the matrix A(z), of order N, and the vector b(z), of length N, where their entries are 
univariate polynomials with coefficients over the integers, finding the solution x(z) such that 


A(z) x(z) = b(z) (6.1) 


can be done using the methods of the previous chapters in conjunction with the CRT for integers. 


The CRT for integers allows for the construction of the integer a from its residual modulo 


the distinct primes Pos Pis Pas + ++ Py 


u, = u mod p;,i = 0,1,2,...,%7, (6.2) 
such that u is uniquely determined in the range 0 to pgpip.- ~~ p, — 1 (or when a range sym- 
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5 5 ). The 


metric about 0 is required, then in the range — 


Newtonian version of the CRT algorithm is used for this construction. Define the value w'*! at the 


k* stage to be 
k-1 
yl) = ylk-H + A [lz (6.3) 
i=0 
where 
dy = ull = uy mod po (6.4) 


a = ((u = ult-1)) x s) mod P, 
and 


k- 


k-1 1 
So = TT (27) mod Pp, @ L P| mod Pr. (6.5) 
i=0 


i= 


To ensure ul*] is symmetric about 0, the a,’s must be such that 
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(If wit! is required to be positive, the a, must be positive.) 


The result is the integer ul") in the mixed radix form 


Y k-1 
ull = 2a [it = a + apo + aypyy, + ° ++ + appr: * Py-1, (6.6) 
- i= 
where 
ull 
Oy = 2, i=0,1,....4. (6.7) 
When 7 is such that 
Pee Pp Pe (6.8) 


then u = u), 
To obtain the solution of (6.1), the first step is to determine for each prime 


,y, the matrix A;(z) and the vector b,(z) such that 


Dreher 
(6.9) 


A(z) = A(z) mod Pp; 
b,(z) = b(z) mod p). 


The next step involves solving the system 
(6.10) 


Aj(z) x;(z) = B(z) mod p,, 
for each prime p,. Using the modular method, this results in the adjoint solution [itz ,4,(z)| of 


(6,10);.where, for? = 0)1,.5., 1,77, 
yz) = A(z) b(z) mod p, (6.11) 
Z 
—— and 


is a vector of polynomials over 
(P,) 
(6.12) 


di(@) = det(4(@)) mod pr , 
oy Using the power series method, together with the Pade conversion algo- 
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wZ=AG) bay. r=01. ey. (6.13) 


The numerators and denominators in the vector x,(z) are, therefore, polynomials over se Thus, 


») 

with some regard for "bad primes", p,, which are addressed in section 6.4, the modular method 
gives 

y(z) = yi{z) mod p; (6.14) 

d(z) = d(z) mod p,, 
and the power series method gives 

x(z) = x,(z) mod p;. (6.15) 
In (6.15), it is intended that congruences are taken seperately for the numerators and denominators 


of the components. 


Therefore, regardless of the method used, each coefficient in the polynomials in (6.11) and 
(6.12), or in (6.13), is computed modulo p,, i = 0,1, . . . ,y. In the third and final step, the CRT 
is applied to each of these coefficients. This yields the integer coefficients in the polynomials of 


y(z) and d(z), or in x(z), as long as the integers are all bounded by 


Ri BORG aT 
ea 
Such a bound is determined in the next section. 


6.2. Cost Analysis 

To get a measure of the cost of finding the solution of (6.1), the first step is to find the cost 
for the reconstruction of a single integer coefficient from its residues. This reconstruction is done 
using modular arithmetic to avoid multi-precision arithmetic. 

Let -y be the bound on the number of primes required to represent the solution’s coefficients, 


such that the coefficient, u, satisfies (6.8). It is advantageous for the primes po ,p;, . . . ,py to be 


ordered such that p, < p;+1, i = 0,1, . . . ,y—1, as will be seen in section 6.3. 
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The cost of calculating the integer u!*! at the k" stage (i.e., for the prime p,), using modular 
arithmetic, involves the calculation of u*~"! mod p, and requires 2(k - 1) operations. The addi- 
tional operation of finding (6.4), given that the S“ are precalculated, means the cost of finding 
(6.3) is 2k operations. Thus the calculation of a coefficient u in the form (6.6) requires -y? opera- 
tions. 

For each of the methods used to solve the system over the integers modulo p,, the cost 
varies. To calculate the third step of algorithm, the CRT application, the number of polynomials 
in the solution and their degrees must be taken into account. Therefore, using dN as the bound on 
the degree of the polynomials of the adjoint solution, the number of operations to apply the CRT 


for each method are 


Modular Solution aN? y? 
Power Series Solution 2dN*y? (6.16) 
Rational Function Solution 2aN>y? 


Using the cost analysis of Chapters 3 and 4, the total number of operations required to find 


the solutions modulo the y + 1 primes ( the second step ), is 


Modular Solution 4a°NPy + 2 aNty 

6.1 
Power Series Solution 40°Ney (6.17) 
Rational Function Solution 1087N?y 


Given that the magnitude of the coefficients of A(z) and b(z) is bounded by v, let t be such 
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Since finding the multiprecision coefficient v modulo the prime p, require t + 1 operations, the cal- 
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culation of (6.9) for the y + 1 primes, (the first step), requires ty3N” operations. 
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Using the bound on the size of the coefficient of the solution given by McClellan[13], y is 


such that 


NiawWe et 
It follows then that t<<y, In fact y can be approximated by y ~ tN, by retaining the dominating 


terms only. 


Thus the cost of finding the solution over the integers of (6.1) using each of the methods, in 


number of operations, is 


Modular Solution areN* + 4a?tN* + San" 
Power Series Solution 2at?N* + 4971N* 9) 
Rationa! Function Solution 2at2N* + 1007tN4 


6.3. Terminating the Iteration 


When calculating the solution to the system of equations of polynomials over the integers, 
unnecessary computation is done when the bound, y, is larger than the actual number of primes 
required. Theorem 6.1 is designed to terminate the iteration when sufficient accuracy has been 
achieved in the integer coefficients of the adjoint solution @,a@) to (6.1) to guarantee a solu- 


tion over the polynomials over the integers. 


The theorem is an extension to polynomial systems with integer coefficients of the theorem 


presented by Cabay[4] for terminating the recursion for integer systems. 


Theorem 6.1 
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Suppose A(z) = [a,(2)| is a matrix of order N and b(z) = [a2] is a vector of length N 
with 
g,(z) = af + az + az? +--+ + afez? (6.19) 


bj(z) = bf) + biz + bf )z? + +--+ + biz? 
where @ is the maximum degree of the polynomials in A(z) and b(z) and the coefficients a{*) and 


b{*) are integers. Suppose A(z) and b(z) are such that 


N é 
z= 2 al |= pappr*** Py (6.20) 
j=~1&=-0 

3 

> [ae = PwWiPr*** Py 

k=-0 


forally, 1c= 7s N. 
If the mixed radix form of the adjoint solution (y(z),d(z)) is 
yz) =») + (6.21) 
: k 
SA (Oper: - pu + °° + Oper: - Pury + yhP-Pars ** * Pareyer) 
k=0 


d(z) = d,(z) + 


8 
D2 (Ope SP Py et Opis uy en Po Pury) 
k=0 
where 
8 8 
yl) = SF (6) + ype t <-> + Ppa -- Pw) = DA!” (6.22) 
k=0 k=0 
8 8 
d,(z) = D2 (ay? + d{):po+ +++ + di-pwp.-- : pus) = eae 
k=-0 k=0 


with being the maximum degree of the solution and yj/*) = [i], a vector of length N with 
entries from the integers mod p, such that 
ja 


yf) 


< (p,-1)2 (6.23) 


< (n- 2 
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A(z) y:(z) = d,(z) b(z). 
Proof 
Suppose the contrary. This implies 


A(z) y,(z) — d,(z) b(z) #0 
but 


A(z) y(z) — d(z) b(z) = 0. 


Thus, there exists ani, 1 < i <N, such that the ** element of the vector (6.26), denoted 


4@y@ - 4@) se), 


there exists a polynomial c(z) with integer coefficients such that 


[4(@) ve(2) - dele) @)|, = (2) = > Oe £0 


where 
8 
yr(z) = Sone 
k=0 


FLD ers, 


such that 


0= [A@ y@ - 4@0@)], = [A@x@ - 4@ 40), - wr - 


Denote the coefficient of the k* term in the polynomial 


[A(2) (2) - 4(2) 60), 
by 


[4(2) (2) - 4@) b@)], 
Thus, 


* Pm +y+1: 


(6.24) 


(6.25) 


(6.26) 


(6.27) 


(6.28) 
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(6.30) 


(6.31) 
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and, using (6.27) and (6.29), this gives 


0 = De ([A@ »@ - 4) 6@)) + paps « « Pueyes) 


This implies that for all k such that c # 0, that 


[4@ »@) ~ 4) 6)? = - craps «+ Parry 
Thus, 
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But, 
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-(5 [pe & ae (a) »)) = [37 > d\") i) 


8 
= 3-((5. 2 = a xf) - [ a" w 
k=0 =1 I+ t+hok 


Combining (6.32) and (6.36) with (6.22) results in 
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(6.32) 


(6.33) 


(6.34) 


(6.35) 


(6.36) 


(6.37) 
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Working with (6.19), (6.23) and (6.37a) results in 
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Combining (6.37) and (6.38) gives that 
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Therefore, 
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since the primes are ordered. 


Thus, (6.41) and (6.35) constitute the contradiction. 
QED 
The theorem is designed to be applied at stage M + yy +1, where M and y are defined as in 


Theorem 6.1, when the last y + 1 applications of the CRT to each coefficient of the polynomials 


of the solution 6.4) has yielded only 0 terms in the mixed radix representation of the 
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coefficent. 


6.4. Bad Prime Problems 
When solving the polynomial systems with integer coefficients using the methods of Chapters 


3 and 4 for solving the polynomial systems with coefficents in uae occasionally some problems are 


(p) 
encountered with bad primes. 
With the modular method the bad prime can only occur when Theorem 3.1 is utilized to ter- 


minate the iteration prior to finding the complete adjoint solution. Consider the system 


A(z) y(z) = d(z) b(z) mod p,, (6.42) 


where A(z) and b(z) mod p, have degree at most 4. If Theorem 3.1 was applied to terminate the 
iteration while solving (6.42) then the system has its solution solved to a degree M + @, for some 
M as defined in Theorem 3.1. Assume, as well, that when solving the system (6.1) modulo another 
prime, p;, p; # p;, the solution modulo p, has degree 8 where 8>M + @. (Note, if the system 
modulo p; used Theorem 3.1 to stop the iteration, the degree 6 is defined since M is defined in the 
theorem.) Thus to apply the CRT for the integer coefficients using these 2 systems, the coefficients 
in the solution mod p, consists of the terms of degree M + d + 1 and greater. But as pointed out 
in the example in Chapter 3, the nature of the terms of degree M + @ + 1 and higher is unknown 


when the theorem is applied. 


It is relatively simple to get around the problem when j<i as the algorithm to solve (6.42) 
can be forced to continue until polynomials of degree 6 are obtained. Thus, the coefficients to 
degree 5 can be constructed using the CRT. 

When i<j though, the problem becomes more complicated. When this happens, for all previ- 


ous primes p,, k<j the system mod p, has been solved only up to a degree t with t < 8. Thus, 


the integer coefficients of the terms of the solution of the degrees t + 1 to 5 cannot be constructed 
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using the information available. One method to find the solution when this happens involves dis- 
carding the solution so far constructed and starting the entire solution process over again with the 
prime p, as the first prime in the sequence of primes. To do this shifting of the primes, the precal- 
culated inverses, S“) , must be recalculated using (6.5). A second method to find the solution 
when a bad prime is encountered is to restart the solution process at the first prime, pp, but ensur- 
ing that the solution modulo p,, i < j, are guaranteed to be at least degree 8. This, again, involves 
discarding the solution thus far computed unless the solutions mod p,,i < j, are retained in the 


state in which the algorithm terminated, for each prime p,. 


The power series form of the solution suffers when the degree of the solutions mod p, vary 


for different p,. Since the degree of the truncated power series outputted from the method over 


—_ is dependent on the degree required to form the rational function approximates of sufficient 


(?) 


degree, the degree of power series solution can vary. To get around this requires that the degree of 
the power series from each intermediate step (i.e., mod p;) be the same. This can be done by deter- 
mining the degree of the truncated power series with integer coefficients and ensuring the inter- 


mediate steps solve the power series mod p, to this degree. 


With the power series methods of Chapter 4, there exist other types of bad prime problems. 
One, that occurs when the matrix A(z) mod p, is singular, was discussed in Chapter 4. In this case, 
the power series solution to (6.1) (which is used in the Pade conversion method) does not exist. 
When this occurs, the prime, p,, must be discarded and the next one in the sequence is used. In 
addition, the cost of finding A(z) mod p, singular which is higher than finding the solution of a 
non-singular system, is discarded with the prime. All that is required to determine if A(z) singular 
is that A(z) mod p, be singular for y + 1 primes p,. 

With the rational function conversion of the power series, there exists another bad prime 
problem. It comes about because the conversion of the power series to rational function form 


returns the rational functions in reduced form. The problem occurs when factors are removed from 
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the rational function solution over — for some prime p, that do not exist in the rational func- 


(p) 


tions with integer coefficents. 


To illustrate the problem, consider the adjoint solution (v2),a@)), d(z) # 0, of (6.1) with 


integer coefficients. The reduced rational function form of the solution with integer coefficients is 


then 

y'i(z) — 

d',(z) fee een NV (6.43) 
where 

yi) = —— 2) as 

Gcp( »,(2).d(@) ) 
a 

ue Gc ( y(2),d@) ) 
and 

GcD(y'(2),4',(@) } =i, ie ‘ar (6.45) 
The problem occurs when 

ecp((y'@) mod p ) (4.@) mod p)) = n(z) (6.46) 


where deg (} > 0 for some i. Thus, the conversion of the power series solution over 2 yin 


(p) 


return the rational function with the factor n(z) removed. But this factor did not exist over the 
integers and thus the degree of the rational function over a is less than the degree of the 


corresponding rational function over the integers and its coefficients are not necessarily the residu- 
als mod p of the coefficents of the corresponding terms of the rational function over the integers. 
In addition, the factor n(z) cannot be determined from the power series or its rational func- 


tion. If the factor did not exist in all N of the rational functions of the solution, it can be deter- 


mined by calculating a common denominator of the rational functions but if the factor was 
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removed from all N rational functions it cannot be determined. Thus, there is not any method of 


recovery when this type of problem occurs other than discarding the prime. 


Thus, when a bad prime is encountered, it generally means that a sizeable amount of calcula- 
tion has been done that cannot be used. In addition, as in the last case, the determination of 


whether a bad prime has been encountered or whether the solution is valid can be difficult. 
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